TALN at SemEval-2016 Task 14: Semantic Taxonomy Enrichment Via Sense-Based Embeddings
Abstract
This paper describes the participation of the TALN team in SemEval-2016 Task 14: Semantic Taxonomy Enrichment. The purpose of the task is to find the best point of attachment in WordNet for a set of out-of-vocabulary (OOV) terms. These terms may come from, for example, domain-specific glossaries, slang, or the jargon typical of Internet forums and chatrooms. Our contribution takes as input an OOV term, its part of speech, and its associated definition, and generates a set of WordNet synset candidates derived from modelling the term’s definition as a sense embedding representation. We leverage a BabelNet-based vector space representation, which allows us to map the algorithm’s prediction to WordNet. Our approach is designed to be generic and applicable to any domain, without exploiting, for instance, HTML markup in source web pages. Our system performs above the median of all submitted systems, and rivals the performance of a strong baseline that extracts the first word of the definition with the same part of speech as the OOV term.
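The general idea of ranking candidate synsets against a vector built from a definition can be illustrated with a minimal sketch. This is not the TALN system: the abstract does not say how the definition is turned into a sense embedding, so the averaging step, the cosine ranking, the function names, and all vectors and synset labels below are assumptions made for illustration only.

```python
import math

# Toy 3-dimensional vectors standing in for pre-trained embeddings.
# All values are made up; a real system would load a trained vector space.
TOY_VECTORS = {
    "dog": [0.9, 0.1, 0.0],
    "animal": [0.8, 0.2, 0.1],
    "small": [0.3, 0.3, 0.3],
    "car": [0.0, 0.1, 0.9],
}

# Hypothetical candidate synsets, each paired with a toy vector.
TOY_SYNSETS = {
    "dog.n.01": [0.9, 0.1, 0.0],
    "car.n.01": [0.0, 0.1, 0.9],
}


def centroid(vectors):
    """Return the component-wise mean of a non-empty list of vectors."""
    size = len(vectors)
    return [sum(column) / size for column in zip(*vectors)]


def cosine(u, v):
    """Return the cosine similarity of u and v, or 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_candidates(definition, vectors, synsets):
    """Rank synsets by similarity to the averaged vector of the definition.

    Words without a vector are ignored. Returns a list of (synset, score)
    pairs sorted from most to least similar, or an empty list when no
    word of the definition has a vector.
    """
    tokens = [t for t in definition.lower().split() if t in vectors]
    if not tokens:
        return []
    query = centroid([vectors[t] for t in tokens])
    scored = [(name, cosine(query, vec)) for name, vec in synsets.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for name, score in rank_candidates("a small dog", TOY_VECTORS, TOY_SYNSETS):
        print(name, round(score, 3))
```

Averaging is only the simplest way to compose a definition vector; it is used here because it keeps the sketch self-contained with no external dependencies.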
Similar resources
VCU at Semeval-2016 Task 14: Evaluating definitional-based similarity measure for semantic taxonomy enrichment
This paper describes the VCU systems that participated in the Semantic Taxonomy Enrichment task of SemEval 2016. The three systems are unsupervised and relied on dictionary-based similarity measures. The first two runs used first-order measures (Lesk and First-order vector), and the third run used a second-order measure (Second-order vector). The first-order measures obtained a higher Wu & Palm...
VCU at Semeval-2016 Task 14: Evaluating similarity measures for semantic taxonomy enrichment
This paper describes the VCU systems that participated in the Semantic Taxonomy Enrichment task of SemEval 2016. The three systems are unsupervised and relied on dictionary-based similarity measures. The first two runs used first-order measures (Lesk and First-order vector), and the third run used a second-order measure (Second-order vector). The first-order measures obtained a higher Wu & Palm...
Deftor at SemEval-2016 Task 14: Taxonomy enrichment using definition vectors
In this paper we describe the participation of the Joint Research Centre, EC, in task 14 Semantic Taxonomy Enrichment at SemEval 2016. The algorithm which we propose transforms each candidate definition into a term vector, where each dimension represents a term and its value is calculated by TF.IDF. We attach the candidate term as a hyponym to the WordNet synset with the most similar definition...
MSejrKu at SemEval-2016 Task 14: Taxonomy Enrichment by Evidence Ranking
Automatic enrichment of semantic taxonomies with novel data is a relatively unexplored task with potential benefits in a broad array of natural language processing problems. Task 14 of SemEval 2016 poses the challenge of designing systems for this task. In this paper, we describe and evaluate several machine learning systems constructed for our participation in the competition. We demonstrate a...
Duluth at SemEval 2016 Task 14: Extending Gloss Overlaps to Enrich Semantic Taxonomies
This paper describes the Duluth systems that participated in Task 14 of SemEval 2016, Semantic Taxonomy Enrichment. There were three related systems in the formal evaluation which are discussed here, along with numerous post–evaluation runs. All of these systems identified synonyms between WordNet and other dictionaries by measuring the gloss overlaps between them. These systems perform better ...